# FP8 Dynamic Quantization
## Qwen3 30B A3B FP8 Dynamic

An FP8 dynamic quantization of the Qwen/Qwen3-30B-A3B model, optimized for inference efficiency on Ampere-architecture GPUs.

Tags: Large Language Model · Transformers
Author: khajaphysist · 403 · 2
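The "dynamic" in FP8 dynamic quantization means activation scales are not calibrated ahead of time; each tensor's scale is recomputed at inference from its own max magnitude. Below is a minimal pure-Python sketch of that per-tensor scale computation, under a loud simplification: the FP8 E4M3 cast is modeled only as clamping to the e4m3 finite range (±448), whereas real kernels also round the mantissa, so this illustrates the scaling logic rather than the hardware format.

```python
# Simplified sketch of dynamic (per-tensor, at-runtime) FP8 E4M3 scaling.
# Assumption: the FP8 cast is modeled only by clamping to the e4m3 range;
# mantissa rounding done by real hardware/kernels is deliberately omitted.

FP8_E4M3_MAX = 448.0  # largest finite value representable in e4m3


def dynamic_fp8_quantize(x):
    """Return (quantized values, scale) for one tensor.

    The scale is recomputed on every call -- that recomputation is what
    makes the scheme "dynamic", as opposed to static calibration.
    """
    amax = max(abs(v) for v in x)
    scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0
    # Divide by the scale so the largest element lands exactly at the
    # e4m3 boundary, then clamp (our stand-in for the FP8 cast).
    return [max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, v / scale)) for v in x], scale


def dequantize(q, scale):
    """Recover approximate original values by multiplying the scale back."""
    return [v * scale for v in q]


activations = [0.5, -2.0, 3.75, 896.0]
q, scale = dynamic_fp8_quantize(activations)   # scale = 896 / 448 = 2.0
x_hat = dequantize(q, scale)
```

Because mantissa rounding is omitted here, the round trip is exact; in a real FP8 cast each value would additionally be snapped to the nearest representable e4m3 number.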
## Llama Joycaption Alpha Two Hf Llava FP8 Dynamic

License: MIT

An FP8-compressed version of the Llama JoyCaption Alpha Two model developed by fancyfeast, produced with the llm-compressor tool and compatible with the vLLM framework.

Tags: Image-to-Text · English
Author: JKCHSTR · 248 · 1
## Magnum V4 72b FP8 Dynamic

License: Apache-2.0

A 72B-parameter large language model fine-tuned from Qwen2.5-72B-Instruct. It uses dynamic FP8 quantization to optimize inference efficiency and aims to reproduce the prose quality of Claude 3.

Tags: Large Language Model · Transformers · English
Author: Infermatic · 2,106 · 2